avoid deadlock on insufficient param bytes in wrpc_runtime_wasmtime
#1045
base: main
Conversation
note: I wasn't sure where to put this test, so I just put it in a standalone file at the root of tests so it's easy to move in case you want it in a specific place
```rust
if diff_len == 0 {
    trace!("consumed empty frame, closing receiver");
    self.as_mut().get_mut().rx.take();
}
```
note: the old code did

```rust
if buf.filled().is_empty() {
    self.rx.take();
}
```

but theoretically, it's possible for `buf` to not be empty when this function is called. I think this is a subtle bug, so I updated it to check the diff (i.e. whether this poll produced new data) instead.
I also think that, similarly, wRPC does not error on extra bytes. This makes some sense from a stream approach (just stop reading data from the stream when you no longer need it), but it is a bit awkward if you want to proactively avoid errors by ensuring you process everything the user sent. We would probably get a fix for this for free if we had a way to denote the end of the stream, though (basically, if we finish parsing and we haven't seen an "end of stream", we error).
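For reference, here is a minimal standalone sketch of the diff-based check discussed above; the `FrameReader` type and its `rx` field are illustrative stand-ins, not the actual wrpc receiver. Because the caller may pass in a `ReadBuf` that already contains bytes, the reader compares the fill level before and after the inner poll instead of checking `buf.filled().is_empty()`.

```rust
use std::io;
use std::pin::Pin;
use std::task::{ready, Context, Poll};

use tokio::io::{AsyncRead, ReadBuf};

// Hypothetical stand-in for the wrpc receiver discussed above.
struct FrameReader<R> {
    rx: Option<R>,
}

impl<R: AsyncRead + Unpin> AsyncRead for FrameReader<R> {
    fn poll_read(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &mut ReadBuf<'_>,
    ) -> Poll<io::Result<()>> {
        let this = self.get_mut();
        let Some(rx) = this.rx.as_mut() else {
            // Receiver already closed: report EOF.
            return Poll::Ready(Ok(()));
        };
        let before = buf.filled().len();
        ready!(Pin::new(rx).poll_read(cx, buf))?;
        // `buf` may have been handed to us partially filled, so check how many
        // bytes *this* poll produced rather than whether `buf` is empty overall.
        let diff_len = buf.filled().len() - before;
        if diff_len == 0 {
            // An empty frame: close the receiver.
            this.rx.take();
        }
        Poll::Ready(Ok(()))
    }
}
```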
rvolosatovs
left a comment
Thank you for the contribution!
I acknowledge the problem you are trying to solve; however, I don't think the approach is valid in the general case.
If what we are trying to solve here is preventing "zombie" connections then I think there is a fairly standard way to do it, which is introducing timeouts. For example, here's how wasi-http solves this problem: https://github.com/WebAssembly/WASI/blob/7e643518a4bf9767ca799d7aa776c560a2fc0351/proposals/http/wit/types.wit#L401-L428
If we follow the wasi-http approach, then we actually already have the connect timeout (https://docs.rs/wrpc-transport/latest/wrpc_transport/invoke/trait.InvokeExt.html#method.timeout), but what's lacking is:
- first byte timeout
- between byte timeout
(which really could just be a single timeout at least to start with)
Users would need to wrap the Incoming stream to add this functionality.
With a quick search I found https://docs.rs/tokio-io-timeout/latest/tokio_io_timeout/index.html, which seems to be doing exactly that.
How about we do this:
- Implement `AsyncRead` for `wrpc_transport::invoke::Timeout` (where `T: AsyncRead`)
- Implement `AsyncWrite` for `wrpc_transport::invoke::Timeout` (where `T: AsyncWrite`)
- Implement `wrpc_transport::Index` for `wrpc_transport::invoke::Timeout` (where `T: wrpc_transport::Index`)
- Wrap `Incoming` and `Outgoing` in `Timeout`

wrpc/crates/transport/src/invoke.rs, lines 91 to 92 in 39d34fe:

```rust
type Outgoing = T::Outgoing;
type Incoming = T::Incoming;
```
We could use the tokio-io-timeout crate for the tokio I/O trait implementations if you wish; personally, I would probably implement it directly in wrpc-transport without introducing a dependency.
Once that's done, we could have a follow-up PR, which implements the timeout method on ServeExt and does roughly the same.
How does this sound? Would this address your concern? Is the implementation plan clear?
I'm happy to jump on a call to discuss if you wish. (you can reach me on BA zulip)
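For concreteness, here is a rough standalone sketch of what such a per-read timeout wrapper could look like; the struct, field names, and the `pin_project_lite` dependency are illustrative assumptions, not the actual `wrpc_transport::invoke::Timeout` implementation.

```rust
use std::future::Future;
use std::io;
use std::pin::Pin;
use std::task::{Context, Poll};
use std::time::Duration;

use pin_project_lite::pin_project;
use tokio::io::{AsyncRead, ReadBuf};
use tokio::time::{sleep, Instant, Sleep};

pin_project! {
    // Illustrative wrapper: fail a read with `TimedOut` if no progress is made
    // within `timeout` of the previous read (or of construction, for the first byte).
    pub struct TimeoutReader<R> {
        #[pin]
        inner: R,
        timeout: Duration,
        #[pin]
        deadline: Sleep,
    }
}

impl<R> TimeoutReader<R> {
    pub fn new(inner: R, timeout: Duration) -> Self {
        Self {
            inner,
            timeout,
            deadline: sleep(timeout),
        }
    }
}

impl<R: AsyncRead> AsyncRead for TimeoutReader<R> {
    fn poll_read(
        self: Pin<&mut Self>,
        cx: &mut Context<'_>,
        buf: &mut ReadBuf<'_>,
    ) -> Poll<io::Result<()>> {
        let this = self.project();
        match this.inner.poll_read(cx, buf) {
            Poll::Ready(res) => {
                // The inner stream made progress (or ended/errored): push the deadline out.
                this.deadline.reset(Instant::now() + *this.timeout);
                Poll::Ready(res)
            }
            Poll::Pending => {
                // No data yet: fail the read once the "between byte" deadline elapses.
                if this.deadline.poll(cx).is_ready() {
                    return Poll::Ready(Err(io::ErrorKind::TimedOut.into()));
                }
                Poll::Pending
            }
        }
    }
}
```

Resetting the deadline after every successful inner poll gives a combined first-byte/between-byte timeout, matching the "single timeout at least to start with" idea above.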
```rust
// Track consecutive Pending results. If we get Pending multiple times after having
// attempted to read, it indicates we're waiting for data that will never arrive.
```
I don't think we can make this assumption, and there are a few reasons for it:
- Various async executors are free to poll futures as many times as they want until they return `Ready`. That can be done, in fact, as a valid optimization: polling a future once with a noop task context to short-circuit the case when it's ready, then constructing a "real" task context and polling again to register the task for wakeup.
- The underlying I/O stream is allowed to wake up spuriously. For example, see https://docs.rs/tokio/latest/tokio/sync/mpsc/struct.Receiver.html#method.poll_recv. Returning `Poll::Pending` 5 times and then returning `Poll::Ready` would be perfectly valid behavior, which would be disallowed by this PR.

I don't think counting the times that `Pending` was returned is a valid approach in the general case.
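As a small self-contained illustration of this point (not wrpc code), the following future returns `Poll::Pending` several times before completing, which any compliant executor handles fine; a wrapper that errors after two consecutive `Pending` results would reject it.

```rust
use std::future::Future;
use std::pin::Pin;
use std::task::{Context, Poll};

// A future that returns Pending a fixed number of times before completing.
// This is perfectly legal behavior for a future or an I/O resource.
struct PendingNTimes {
    remaining: u32,
}

impl Future for PendingNTimes {
    type Output = ();

    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.remaining == 0 {
            return Poll::Ready(());
        }
        self.remaining -= 1;
        // Wake ourselves immediately so the executor polls again,
        // modeling a spurious wakeup.
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

#[tokio::main]
async fn main() {
    // Completes on any compliant executor even though it was Pending 5 times in a row.
    PendingNTimes { remaining: 5 }.await;
    println!("completed after several consecutive Pending polls");
}
```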
```rust
outgoing.flush().await?;
let mut buf = vec![];
```
Suggested change:

```diff
 outgoing.flush().await?;
+drop(outgoing);
 let mut buf = vec![];
```
I don't think this usage is valid. By not sending the parameters in the initial request and keeping the stream open, the client signals that it will send parameters asynchronously. This is a fundamental part of the design, which allows us to implement e.g. future and stream types.
Closing the stream (by dropping it, for example), as expected, does not cause a deadlock on current main.
Also, as a general note, please avoid committing Wasm files to this repository.
wRPC deadlocks/hangs indefinitely when a client calls `invoke` on a function via `wrpc-runtime-wasmtime` (ex: `serve_function_shared`) with insufficient encoded data for the expected `params` type. This can occur when the client sends fewer params than expected, or when the encoded bytes otherwise don't match the expected types.

Why this happens
- In the `call` function in `wrpc-runtime-wasmtime`, the params are received as bytes over a stream (`rx`). These are untyped raw bytes.
- The expected parameter types are given by `params_ty`.

You can see the reading from `rx` in a loop according to `params_ty` inside `call` here.

However, if the contents of the stream don't match the expected params (ex: fewer params than expected, or there aren't enough bytes to finish the loop because maybe you passed the wrong type), this for loop will never terminate.
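To make the failure mode concrete, here is a simplified sketch of the shape of that loop; the `ParamTy` enum and `read_params` function are illustrative stand-ins, since the real code decodes values according to the component's wasmtime/wRPC types.

```rust
use std::io;

use tokio::io::{AsyncRead, AsyncReadExt};

// Illustrative stand-in for wRPC's parameter types; the real decoding walks the
// component's types and is considerably richer.
enum ParamTy {
    U8,
    U32,
}

// Shape of the problem: decode one value per expected parameter type from the
// incoming byte stream. If the client sent fewer bytes than `params_ty` requires
// and never closes the stream, one of these reads awaits forever, because the
// underlying `poll_read` keeps returning `Poll::Pending`.
async fn read_params(
    rx: &mut (impl AsyncRead + Unpin),
    params_ty: &[ParamTy],
) -> io::Result<Vec<u64>> {
    let mut params = Vec::with_capacity(params_ty.len());
    for ty in params_ty {
        let value = match ty {
            ParamTy::U8 => u64::from(rx.read_u8().await?),
            ParamTy::U32 => u64::from(rx.read_u32().await?),
        };
        params.push(value);
    }
    Ok(params)
}
```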
Why did this never get caught?

Other than it being an edge case, this bug only appears in the `wrpc-runtime-wasmtime` crate, while most tests use `serve_values`/`invoke_values_blocking`, which do not go through this code path.

Why this matters
This is a denial-of-service (DoS) vulnerability. Malicious actors can exploit this to exhaust server resources (connections, memory, file descriptors) by sending malformed requests that hang indefinitely, making the server unavailable to legitimate clients.
How should it be fixed?

Internally, `read_value` checks which type it should use, then tries to read it (ex: `read_u8().await`). This then calls `poll_read` on the `Incoming` stream. If it's ever unable to parse the value because there aren't enough bytes left, `poll_read` deadlocks in the `Pending` state inside the `ready!` macro.

Example trace
Some solutions that don't work:
- Making `read_u8` (or similar functions) fail: there is no way for `read_value` to know it has reached the end of the data, and certain types can have multiple encodings, so it's hard for it to magically know when it has received enough data for certain data types (wrong place to try to fix this).

Possible solution A: modify `ingress`/`egress` behavior
ingress/egressbehaviorGoal: somehow indicate we've reached the end of the data
calldoes not actually receive data from the network directly. Rather, the data is first read in by aningressin the server that processes the data first.Notably, the connection is established as follows:
a.
serve_function_sharedserves a connection viaConnb.
invokealso connects viaConnThey then communicate via an egress in the client to an ingress in the server.
It's possible there is a way to make the `ingress` on the server side more aware of what is going on in the communication protocol so it can properly indicate EOF (and maybe add other protections, like limits on payload size, timeouts, or whatever other solutions people prefer to protect their server). Currently it's just an infinite loop until it receives enough data, which in turn means `call` also stays blocked waiting for more data from the ingress, which it never receives.

Possible solution B (used by this PR): modify `poll_read`

Instead of trying to redesign the `ingress` (which is probably too big a decision for me to make), I instead implemented a heuristic in `poll_read` to try to avoid deadlocks.
*this.has_read_databranch)Poll::Pendingno matter how many times we try and get data. We assume it's because no data is ever coming, so we error (*this.pending_count > 1branch)Note: this is not perfect, because on slow connections this could cause an error even though the data really was coming. I think this is not too likely, and I'd prefer to accidentally error on slow connections than deadlock on improperly formatted calls.